Regularization in network optimization via trimmed stochastic gradient descent with noisy label

Authors

Abstract

Regularization is essential for avoiding over-fitting to the training data in network optimization, leading to better generalization of the trained networks. Label noise provides a strong implicit regularization by replacing the target ground-truth labels of training examples with uniform random labels. However, it can also cause undesirable misleading gradients due to the large loss associated with incorrect labels. We propose a first-order optimization method (Label-Noised Trim-SGD) that uses example trimming to remove outliers based on the loss. The proposed algorithm is simple yet enables us to impose a larger label noise and obtain a better regularization effect than the original methods. A quantitative analysis is performed by comparing the behavior of label noise, example trimming, and the proposed algorithm. We also present empirical results that demonstrate the effectiveness of our algorithm using major benchmarks and fundamental networks, where it has successfully outperformed state-of-the-art methods.
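
The abstract describes combining label-noise regularization with loss-based example trimming. Below is a minimal PyTorch-style sketch of one such training step; the function name, the `noise_rate` and `trim_rate` hyper-parameters, and the exact ordering of the noise and trimming operations are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def label_noised_trim_step(model, optimizer, x, y, num_classes,
                           noise_rate=0.2, trim_rate=0.1):
    """Hypothetical sketch of one Label-Noised Trim-SGD step: inject uniform
    random labels, then drop the highest-loss examples before the update."""
    # Replace a random subset of labels with uniform random labels (implicit regularization).
    noisy = torch.rand(y.shape[0], device=y.device) < noise_rate
    y_noisy = y.clone()
    y_noisy[noisy] = torch.randint(0, num_classes, (int(noisy.sum()),), device=y.device)

    # Per-example losses on the (possibly noised) labels.
    losses = F.cross_entropy(model(x), y_noisy, reduction="none")

    # Trim: keep only the smallest-loss examples, discarding likely outliers
    # whose large losses would produce misleading gradients.
    keep = losses.argsort()[: max(1, int((1.0 - trim_rate) * losses.numel()))]
    loss = losses[keep].mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```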


Related articles

Distributed Stochastic Optimization via Adaptive Stochastic Gradient Descent

Stochastic convex optimization algorithms are the most popular way to train machine learning models on large-scale data. Scaling up the training process of these models is crucial in many applications, but the most popular algorithm, Stochastic Gradient Descent (SGD), is a serial algorithm that is surprisingly hard to parallelize. In this paper, we propose an efficient distributed stochastic op...

The Regularization Effects of Anisotropic Noise in Stochastic Gradient Descent

Understanding the generalization of deep learning has raised lots of concerns recently, where the learning algorithms play an important role in generalization performance, such as stochastic gradient descent (SGD). Along this line, we particularly study the anisotropic noise introduced by SGD, and investigate its importance for the generalization in deep neural networks. Through a thorough empi...

Ant Colony Optimization and Stochastic Gradient Descent

In this article, we study the relationship between the two techniques known as ant colony optimization (ACO) and stochastic gradient descent. More precisely, we show that some empirical ACO algorithms approximate stochastic gradient descent in the space of pheromones, and we propose an implementation of stochastic gradient descent that belongs to the family of ACO algorithms. We then use this i...

Stochastic Proximal Gradient Descent for Nuclear Norm Regularization

In this paper, we utilize stochastic optimization to reduce the space complexity of convex composite optimization with a nuclear norm regularizer, where the variable is a matrix of size m × n. By constructing a low-rank estimate of the gradient, we propose an iterative algorithm based on stochastic proximal gradient descent (SPGD), and take the last iterate of SPGD as the final solution. The ma...
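
For context, the proximal step that an SPGD iteration applies for the nuclear-norm regularizer is soft-thresholding of the singular values. The NumPy sketch below shows that generic step with a full SVD; the paper's own contribution is a low-rank estimate of the gradient that avoids this cost, so the names `prox_nuclear`, `spgd_step`, `lr`, and `lam` are illustrative assumptions only.

```python
import numpy as np

def prox_nuclear(W, tau):
    """Proximal operator of tau * ||W||_*: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def spgd_step(W, stoch_grad, lr, lam):
    """One stochastic proximal gradient step for f(W) + lam * ||W||_*."""
    return prox_nuclear(W - lr * stoch_grad(W), lr * lam)
```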

Iterate averaging as regularization for stochastic gradient descent

We propose and analyze a variant of the classic Polyak-Ruppert averaging scheme, broadly used in stochastic gradient methods. Rather than a uniform average of the iterates, we consider a weighted average, with weights decaying in a geometric fashion. In the context of linear least squares regression, we show that this averaging scheme has the same regularizing effect, and indeed is asymptotic...
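
The weighted average with geometrically decaying weights described above can be maintained online as an exponential moving average of the SGD iterates. The sketch below illustrates that idea only, not the paper's analysis; the decay factor `beta`, step size `lr`, and the function names are assumed for illustration.

```python
import numpy as np

def sgd_with_geometric_iterate_averaging(stoch_grad, w0, lr=0.01, beta=0.99, steps=1000):
    """Run plain SGD while maintaining a geometrically weighted average of the iterates."""
    w = np.array(w0, dtype=float)
    w_avg = w.copy()
    for _ in range(steps):
        w = w - lr * stoch_grad(w)             # usual SGD iterate
        w_avg = beta * w_avg + (1 - beta) * w  # weights on past iterates decay geometrically
    return w_avg
```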


Journal

Journal: IEEE Access

Year: 2022

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2022.3171910